Proteins perform biological functions by folding into specific 3D structures. To model protein structures accurately, both the overall geometric topology and the local fine-grained relationships between amino acids (e.g., side-chain torsion angles and inter-residue orientations) should be carefully considered. In this work, we propose directed weight neural networks to better capture the geometric relationships among different amino acids. Our new framework extends individual weights from scalars to 3D directed vectors, supporting a rich set of geometric operations on both classical and SO(3)-representation features, on top of which we construct a perceptron unit for processing amino-acid information. We further introduce an equivariant message-passing paradigm on proteins that plugs the directed-weight perceptrons into existing graph neural networks, showing superior versatility in maintaining SO(3)-equivariance at the global scale. Experiments show that our network has greater expressive power in representing geometric relationships than classical neural networks and (globally) equivariant networks. It also achieves state-of-the-art performance on various computational biology applications related to protein 3D structures.
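The key property claimed above, layers that operate on vector-valued features while preserving SO(3)-equivariance, can be illustrated with a minimal numpy sketch. This is not the paper's directed-weight perceptron; it only shows the simplest equivariant building block (scalar weights mixing 3D vector channels), and all names here are illustrative assumptions.

```python
import numpy as np

def directed_weight_layer(V, W):
    """Mix C_in vector-valued channels into C_out channels.

    V: (C_in, 3) array of 3D vector features.
    W: (C_out, C_in) scalar mixing weights.
    Each output vector is a linear combination of input vectors,
    so the layer commutes with any rotation R: f(V R^T) = f(V) R^T.
    """
    return W @ V  # (C_out, 3)

def random_rotation(rng):
    # QR decomposition of a random Gaussian matrix yields an orthogonal Q.
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:  # flip a column to ensure a proper rotation (det = +1)
        Q[:, 0] *= -1
    return Q

rng = np.random.default_rng(0)
V = rng.normal(size=(4, 3))   # 4 vector-valued channels
W = rng.normal(size=(8, 4))   # mix into 8 channels
R = random_rotation(rng)

# Rotating the inputs and then applying the layer gives the same
# result as applying the layer and then rotating the outputs.
rotated_then_mapped = directed_weight_layer(V @ R.T, W)
mapped_then_rotated = directed_weight_layer(V, W) @ R.T
assert np.allclose(rotated_then_mapped, mapped_then_rotated)
```

A nonlinearity for such features must act only on rotation-invariant quantities (e.g., vector norms) to keep this commutation property; that is the design constraint equivariant perceptrons have to satisfy.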
Retrosynthesis is a major task in drug discovery. Many existing methods formulate it as a graph generation problem. Specifically, these methods first identify the reaction center and break the target molecule accordingly to generate synthons. Reactants are then generated either by sequentially adding atoms to the synthons or by directly attaching the correct leaving groups. However, both strategies have drawbacks: adding atoms leads to long prediction sequences, which increases generation difficulty, while attaching leaving groups can only consider those seen in the training set, leading to poor generalization. In this paper, we propose a novel end-to-end graph generation model for retrosynthesis prediction, which sequentially identifies the reaction center, generates synthons, and attaches motifs to the synthons to generate reactants. Since chemically meaningful motifs are larger than atoms yet smaller than leaving groups, our method enjoys lower prediction complexity than adding atoms and better generalization than attaching leaving groups. Experiments on benchmark datasets show that the proposed model significantly outperforms previous state-of-the-art algorithms.
Motivation: Medical image analysis involves tasks that help physicians perform qualitative and quantitative assessment of lesions or anatomical structures, significantly improving the accuracy and reliability of diagnosis and prognosis. Traditionally, these tasks are performed by physicians or medical physicists, which raises two major problems: (i) low efficiency; (ii) bias from personal experience. Over the past decade, many machine learning methods have been applied to accelerate and automate the image analysis process. Compared with the extensive deployment of supervised and unsupervised learning models, there have been relatively few attempts to use reinforcement learning in medical image analysis. This review article can serve as a stepping stone for related research. Significance: From our observations, although reinforcement learning has gradually gained momentum in recent years, many researchers in the medical analysis field find it hard to understand and deploy in clinics. One reason is the lack of well-organized review articles targeted at readers without a professional computer science background. Rather than providing a comprehensive list of all reinforcement learning models in medical image analysis, this article may help readers learn how to formulate and solve their medical image analysis research as reinforcement learning problems. Methods & Results: We selected published articles from Google Scholar and PubMed. Given the scarcity of related articles, we also included some outstanding recent preprints. The papers are carefully reviewed and categorized according to the type of image analysis task. We first review the basic concepts and popular models of reinforcement learning. We then explore the applications of reinforcement learning models in landmark detection. Finally, we conclude the article by discussing the limitations of the reviewed reinforcement learning methods and possible improvements.
In cooperative multi-agent reinforcement learning (MARL), where agents can only obtain partial observations, efficiently leveraging local information is essential. During long-term observation, agents can build awareness for teammates to alleviate the partial observability problem. However, previous MARL methods usually neglect this kind of utilization of local information. To address this issue, we propose a novel framework, multi-agent Local INformation Decomposition for Awareness of teammates (LINDA), with which agents learn to decompose local information and build awareness for each teammate. We model awareness as stochastic random variables and perform representation learning to ensure the informativeness of awareness representations, by maximizing the mutual information between awareness and the actual trajectory of the corresponding agent. LINDA is agnostic to specific algorithms and can be flexibly integrated into different MARL methods. Sufficient experiments show that the proposed framework learns informative awareness from local partial observations for better collaboration and significantly improves learning performance, especially on challenging tasks.
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes the image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple while its performance is impressive. CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains strongly robust even if the LiDAR is missing. Code will be released at https://github.com/junjie18/CMT.
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
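The NAIVEATTACK variant described above, stamping triggers onto the raw data before distillation ever sees it, can be sketched in a few lines of numpy. This is a hedged illustration of the general trigger-injection idea, not the authors' code; the function name, trigger shape, and poisoning rate are assumptions for the example.

```python
import numpy as np

def add_trigger(images, labels, target_class, trigger_size=3, rate=0.1, seed=0):
    """Stamp a white square trigger onto a fraction of the raw images and
    relabel them to the attacker's target class, *before* any distillation
    step sees the data (in contrast to attacks at model-training time).

    images: (N, H, W) float array in [0, 1]; labels: (N,) int array.
    Returns poisoned copies plus the indices of the poisoned samples.
    """
    rng = np.random.default_rng(seed)
    imgs, labs = images.copy(), labels.copy()
    n_poison = max(1, int(rate * len(imgs)))
    idx = rng.choice(len(imgs), size=n_poison, replace=False)
    imgs[idx, -trigger_size:, -trigger_size:] = 1.0  # bottom-right patch
    labs[idx] = target_class
    return imgs, labs, idx

# Toy data: 20 grayscale 8x8 "images" of four classes.
images = np.zeros((20, 8, 8))
labels = np.arange(20) % 4
p_imgs, p_labs, idx = add_trigger(images, labels, target_class=0)
assert (p_imgs[idx, -3:, -3:] == 1.0).all()  # trigger present
assert (p_labs[idx] == 0).all()              # labels flipped to target
```

DOORPING differs in that the trigger itself is optimized iteratively across distillation steps rather than fixed up front, which is why it reaches higher ASR in the paper's evaluation.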
Automatic music generation with artificial intelligence typically requires a large amount of data, which is hard to obtain for many less common genres and musical instruments. To tackle this issue, we present ongoing work and preliminary findings on the possibility for deep models to transfer knowledge from language to music, by finetuning large language models pre-trained on a massive text corpus on only hundreds of MIDI files of drum performances. We show that by doing so, one of the largest, state-of-the-art models (GPT3) is capable of generating reasonable drum grooves, while models that are not pre-trained (Transformer) show no such ability beyond naive repetition. Evaluating generated music is a challenging task; evaluating drum grooves, with little precedent in the literature, is even more so. Hence, we propose a tailored structural evaluation method and analyze drum grooves produced by GPT3 compared to those played by human professionals, exposing the strengths and weaknesses of such generation by language-to-music transfer. Our findings suggest that language-to-music transfer learning with large language models is viable and promising.
Few-Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes from only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold: first, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features; second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After the above steps, the novel classes can be improved significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modifications. When benchmarking on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shot counts; e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few-Shot Object Detection. Code and models will be available.
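The first insight above, pooling support features inside the object mask into a class center and using it to re-weight query features, can be sketched with plain numpy. This is only a hedged illustration of mask-based feature re-weighting under assumed shapes, not RefT's actual module; the function names and the sigmoid gate are assumptions for the example.

```python
import numpy as np

def masked_class_center(support_feats, support_mask):
    """Average support features inside the binary object mask
    to obtain a class center vector.
    support_feats: (H, W, C); support_mask: (H, W) in {0, 1}."""
    m = support_mask.astype(float)
    return (support_feats * m[..., None]).sum(axis=(0, 1)) / (m.sum() + 1e-6)

def reweight_query(query_feats, center):
    """Gate each query-feature location by its cosine similarity
    to the class center, passed through a sigmoid."""
    q = query_feats.reshape(-1, query_feats.shape[-1])
    c = center / (np.linalg.norm(center) + 1e-6)
    sim = (q / (np.linalg.norm(q, axis=1, keepdims=True) + 1e-6)) @ c
    gate = 1.0 / (1.0 + np.exp(-sim))  # sigmoid gate in (0, 1)
    return (q * gate[:, None]).reshape(query_feats.shape)

rng = np.random.default_rng(1)
sf = rng.normal(size=(4, 4, 8))
mask = np.zeros((4, 4)); mask[1:3, 1:3] = 1
center = masked_class_center(sf, mask)

qf = rng.normal(size=(4, 4, 8))
qf[0, 0] = center      # location aligned with the class center
qf[0, 1] = -center     # location anti-aligned with the class center
out = reweight_query(qf, center)
assert out.shape == qf.shape
# The aligned location keeps more of its magnitude than the anti-aligned one.
assert np.linalg.norm(out[0, 0]) > np.linalg.norm(out[0, 1])
```

The instance-level enhancement in the paper (linking support object queries to query features via cross-attention) operates on learned queries rather than spatial features and is not shown here.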
Graph Neural Networks (GNNs) have shown satisfying performance on various graph learning tasks. To achieve better fitting capability, most GNNs have a large number of parameters, which makes them computationally expensive. Therefore, it is difficult to deploy them onto edge devices with scarce computational resources, e.g., mobile phones and wearable smart devices. Knowledge Distillation (KD) is a common solution to compress GNNs, where a lightweight model (i.e., the student model) is encouraged to mimic the behavior of a computationally expensive GNN (i.e., the teacher GNN model). Nevertheless, most existing GNN-based KD methods lack fairness consideration. As a consequence, the student model usually inherits and even exaggerates the bias of the teacher GNN. To handle such a problem, we take initial steps towards fair knowledge distillation for GNNs. Specifically, we first formulate a novel problem of fair knowledge distillation for GNN-based teacher-student frameworks. Then we propose a principled framework named RELIANT to mitigate the bias exhibited by the student model. Notably, the design of RELIANT is decoupled from any specific teacher and student model structures, and thus can be easily adapted to various GNN-based KD frameworks. We perform extensive experiments on multiple real-world datasets, which corroborate that RELIANT achieves less biased GNN knowledge distillation while maintaining high prediction utility.
This paper focuses on designing efficient models with low parameter counts and FLOPs for dense predictions. Even though CNN-based lightweight methods have achieved stunning results after years of research, the trade-off between model accuracy and constrained resources still needs further improvement. This work rethinks the essential unity of the efficient Inverted Residual Block in MobileNetv2 and the effective Transformer in ViT, inductively abstracting a general concept of the Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance even though instantiations share the same framework. Motivated by this phenomenon, we deduce a simple yet efficient modern Inverted Residual Mobile Block (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependencies and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase Efficient MOdel (EMO) based only on a series of iRMBs for dense applications. Extensive experiments on the ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of our EMO over state-of-the-art methods; e.g., our EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1 accuracy, surpassing SoTA CNN-/Transformer-based models while trading off model accuracy and efficiency well.
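The iRMB idea, one block that combines CNN-like local mixing with Transformer-like global mixing inside an inverted-residual (expand, mix, project) shape, can be sketched in numpy over a 1D token sequence. This is a hedged, simplified sketch of the general pattern, not the paper's iRMB (which operates on 2D feature maps with its own attention variant); every name and shape here is an assumption for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def depthwise_conv1d(x, k):
    """Per-channel 3-tap convolution over the token axis (local mixing).
    x: (N, C); k: (3, C) per-channel kernels."""
    pad = np.pad(x, ((1, 1), (0, 0)))
    return k[0] * pad[:-2] + k[1] * pad[1:-1] + k[2] * pad[2:]

def irmb_block(x, params):
    """Sketch of an inverted-residual mobile block: expand channels,
    mix locally (depthwise conv) and globally (self-attention),
    project back down, and add the residual."""
    Wexp, k, Wq, Wk, Wv, Wproj = params
    h = x @ Wexp                       # 1x1 expansion
    h = h + depthwise_conv1d(h, k)     # CNN-like short-distance modeling
    q, kk, v = h @ Wq, h @ Wk, h @ Wv
    attn = softmax(q @ kk.T / np.sqrt(q.shape[-1]))
    h = h + attn @ v                   # Transformer-like long-distance modeling
    return x + h @ Wproj               # projection + residual connection

rng = np.random.default_rng(0)
N, C = 6, 8        # 6 tokens, 8 channels
E = C * 2          # expansion ratio 2
params = (rng.normal(size=(C, E)) * 0.1,   # Wexp
          rng.normal(size=(3, E)) * 0.1,   # depthwise kernels
          rng.normal(size=(E, E)) * 0.1,   # Wq
          rng.normal(size=(E, E)) * 0.1,   # Wk
          rng.normal(size=(E, E)) * 0.1,   # Wv
          rng.normal(size=(E, C)) * 0.1)   # Wproj
x = rng.normal(size=(N, C))
y = irmb_block(x, params)
assert y.shape == x.shape  # block preserves the token/channel shape
```

Stacking such shape-preserving blocks into a ResNet-like 4-phase layout, with downsampling between phases, is the pattern the EMO model described above follows.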